vision AI AI News List | Blockchain.News

List of AI News about vision AI

Time Details
2025-12-22 10:35
Next-Token Prediction in Vision AI: New Training Method Drives 83.8% ImageNet Accuracy and Strong Transfer Learning

According to @SciTechera, a new training approach brings next-token prediction, the objective commonly used to train language models, to vision AI by treating visual embeddings as a sequence of tokens. Applied to Vision Transformers (ViTs), the method needs neither pixel reconstruction nor complex contrastive losses, and it learns from unlabeled data. A ViT-Base model reaches 83.8% top-1 accuracy on ImageNet-1K after fine-tuning, rivalling more complex self-supervised techniques (source: SciTechera, https://x.com/SciTechera/status/2003038741334741425). The study also reports strong transfer to semantic segmentation on ADE20K, suggesting that the model captures meaningful visual structure rather than memorizing patterns. Because it relies on unlabeled data and a simple objective, the approach could lower the cost of building vision systems in industries such as healthcare, manufacturing, and autonomous vehicles.
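
The idea of treating visual embeddings as sequential tokens can be illustrated with a toy example. The sketch below is not the study's implementation: the helper names (`patchify`, `embed`, `next_token_loss`), the fixed projection weights, and the identity "predictor" are all assumptions made for illustration. A real model would presumably use a learned embedding and a causal transformer that attends to every earlier token, whereas this predictor sees only the immediately preceding one.

```python
# Illustrative sketch only: every name and the toy predictor are assumptions,
# not the method from the cited study.

def patchify(image, patch):
    """Split a 2D list (H x W) into flattened patch vectors in raster order."""
    h, w = len(image), len(image[0])
    tokens = []
    for top in range(0, h - patch + 1, patch):
        for left in range(0, w - patch + 1, patch):
            tokens.append([image[r][c]
                           for r in range(top, top + patch)
                           for c in range(left, left + patch)])
    return tokens


def embed(token, weights):
    """Linear projection; `weights` holds one row per output dimension."""
    return [sum(w * x for w, x in zip(row, token)) for row in weights]


def next_token_pairs(embeddings):
    """Pair each embedding with the one that follows it (shift by one)."""
    return list(zip(embeddings[:-1], embeddings[1:]))


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def next_token_loss(embeddings, predictor):
    """Average error of predicting embedding t+1 from embedding t."""
    pairs = next_token_pairs(embeddings)
    return sum(mse(predictor(src), tgt) for src, tgt in pairs) / len(pairs)


# A 4x4 toy "image" split into four 2x2 patches.
image = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
# Hypothetical fixed projection; a real model would learn these weights.
weights = [
    [0.25, 0.25, 0.25, 0.25],  # mean intensity of the patch
    [-0.5, 0.5, -0.5, 0.5],    # horizontal contrast within the patch
]
embeddings = [embed(token, weights) for token in patchify(image, 2)]


def identity(embedding):
    """Stand-in predictor: guesses that the next embedding equals this one."""
    return embedding


loss = next_token_loss(embeddings, identity)
print(f"next-token loss with identity predictor: {loss:.3f}")
```

Training would replace `identity` with a learned predictor and minimize this loss over many unlabeled images. No labels, pixel reconstruction targets, or contrastive pairs appear anywhere in the objective.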

2025-12-07 13:57
Gemini 3 Pro Sets New Standard in Vision AI: SOTA Multimodal Capabilities for Documents, Images, and Video

According to @demishassabis, Gemini 3 Pro is a state-of-the-art (SOTA) vision AI model that outperforms previous systems across all major vision and multimodal benchmarks (source: Demis Hassabis, Twitter). Its multimodal capabilities cover understanding of documents, screens, images, videos, and spatial data. Businesses can apply them to intelligent document processing, video analytics, and cross-modal data integration, with potential gains in enterprise automation and productivity.

2025-06-11 17:00
Meta Unveils V-JEPA-v2: Advanced Self-Supervised Vision AI Model for Business Applications

According to Yann LeCun (@ylecun), Meta has released V-JEPA-v2, a new version of its self-supervised vision model designed to improve visual reasoning and understanding without relying on labeled data (source: @ylecun, June 11, 2025). V-JEPA-v2 uses a joint embedding predictive architecture (JEPA), enabling more efficient training and better generalization across varied visual tasks. By lowering data annotation costs and accelerating deployment of AI-powered vision systems, the release is expected to create business opportunities in industries such as autonomous vehicles, retail analytics, and healthcare imaging.
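
A joint embedding predictive objective can be sketched in a few lines. The sketch below is a generic illustration, not Meta's code: the function names, the fixed encoder weights, and the offset-based predictor are all hypothetical, and a single set of weights stands in for both the context and target encoders. What it aims to show is that the loss compares a predicted embedding with a target embedding, so the comparison happens in representation space rather than pixel space.

```python
# Generic JEPA-style objective, for illustration only. All names and values
# are assumptions; this is not V-JEPA-v2's implementation.

def encode(patch, weights):
    """Toy linear encoder; `weights` holds one row per embedding dimension."""
    return [sum(w * x for w, x in zip(row, patch)) for row in weights]


def predict(context_embedding, offset):
    """Hypothetical predictor: shifts the context embedding by an offset."""
    return [c + o for c, o in zip(context_embedding, offset)]


def embedding_loss(predicted, target):
    """Mean squared error measured between embeddings, not pixels."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


context_patch = [1.0, 2.0, 3.0, 4.0]  # visible region
target_patch = [2.0, 3.0, 4.0, 5.0]   # hidden region to be predicted

# The same weights stand in for the context and target encoders here.
encoder_weights = [
    [0.25, 0.25, 0.25, 0.25],
    [1.0, 0.0, 0.0, -1.0],
]
context_emb = encode(context_patch, encoder_weights)
target_emb = encode(target_patch, encoder_weights)

# Without a predictor, the context embedding misses the target.
baseline_loss = embedding_loss(context_emb, target_emb)

# An offset chosen by hand for this toy pair; training would learn it.
predicted_emb = predict(context_emb, [1.0, 0.0])
jepa_loss = embedding_loss(predicted_emb, target_emb)

print(f"baseline: {baseline_loss}, with predictor: {jepa_loss}")
```

No label appears anywhere in the loss: the supervision signal is the embedding of the hidden region itself.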
